(57) Abstract

In a system including a collection of cooperating cache servers, such as proxy cache servers, a request can be forwarded to a
cooperating cache server if the requested object cannot be found locally. An overload condition is detected if, for example, due to reference
skew, some objects are in high demand by all the clients and the cache servers that contain those hot objects become overloaded due to
forwarded requests. In response, the load is balanced by shifting some or all of the forwarded requests from an overloaded cache server to
a less loaded one. Both centralized and distributed load balancing environments are described.
LOAD BALANCING COOPERATING CACHE SERVERS

Field of the Invention 

The present invention is related to load balancing among cooperating
cache servers and in particular to load balancing based on load conditions
and a frequency that requests are forwarded from cooperating cache
servers.

Background

Usage of the World Wide Web has been growing
exponentially. As a result, response times for accessing web objects can
become unsatisfactorily slow. One approach to improving web access time
is to employ one or more proxy cache servers between browsers and the
originating web servers. Examples of proxy cache servers include a cluster
of PC servers running Microsoft's Windows NT™, such as the NETFINITY™
servers from IBM; and workstation servers running IBM's AIX™ operating
system, such as the IBM RS/6000™ or SP/2™. In fact, more and more
organizations, such as Internet Service Providers (ISPs) and corporations,
are using a collection of cooperating proxy cache servers to help improve
response time as well as reduce traffic to the Internet. A collection of
cooperating cache servers has distinct advantages over a single cache
server in terms of reliability and performance. If one fails, requests can
still be serviced by other cooperating cache servers. Requests can be
distributed among the servers, thus increasing scalability. Finally, the
aggregate cache size is much larger so that it is more likely that a
requested object will be found in one of the cache servers.

With cooperating cache servers, a request that cannot be serviced
locally due to a cache miss can be forwarded to another cache server
storing the requested object. As a result, there are two kinds of
requests that can come to a cache server: direct requests and forwarded
requests. Direct requests are those that are received directly from
clients. Forwarded requests are those that come from other cooperating
cache servers on behalf of their clients due to cache misses on the cache
servers. With requests forwarded among the cache servers, a cache server
can easily become overloaded if it happens to contain in-demand (or "hot")
objects that most clients are currently interested in, creating uneven
workloads among the cache servers. Uneven workloads can create a
performance bottleneck, as many of the cache servers are waiting for the
same overloaded cache server to respond to requests forwarded to it.
Therefore, there is a need for a way to perform dynamic load balancing
among a collection of proxy cache servers. The present invention addresses
such a need.
Load balancing is traditionally done by a front-end scheduler which
evenly distributes incoming direct requests among the cache servers. For
example, load balancing can be done at the DNS level by manipulating a
mapping table, such as is done by the NETRA™ proxy cache by Sun
Microsystems ("Proxy Cache Server, Product Overview", white paper, Sun
Microsystems, http://www.sun.com/). Load balancing among a cluster of
servers can also be done with a front-end router, such as the
NETDISPATCHER™ offered by IBM (see e.g., G. Goldszmidt and G. Hunt,
"NetDispatcher: A TCP Connection Router," IBM Research Report, RC 20853,
[...]). Here, incoming requests are distributed by the router
to the least loaded server in the cluster. However, these traditional
approaches distribute only "direct requests" and do not address a load
balance problem resulting from too many requests for hot objects being
simultaneously forwarded to the same proxy server. The present invention
addresses such a need.

Cooperative caching, or remote caching, has been used in distributed
file systems to improve system performance (see "Cooperative Caching:
Using Remote Client Memory to Improve File System Performance," by M. P.
[...]
overall file cache. Each workstation caches not only objects referenced by
local requests but also objects that may be referenced by requests from a
remote workstation. Upon a local cache miss, a local request can be sent
to other client workstations where a copy can be obtained, if found;
otherwise the object is obtained from the object server. The emphasis
here is mainly on how to maintain cache coherency in the face of updates and
how to maintain cache hit ratios by moving a locally replaced object to
the cache memory of another workstation. There is no dynamic load
balancing.

Cooperative caching is also used in collective proxy cache servers
to reduce the access time. Upon a cache miss, instead of going directly to
the originating web server, potentially through a WAN, a cache server may
forward the request to obtain the object from a cooperating cache server
in a local area network or a regional area network. For example, upon a
local cache miss in the SQUID system, a cache server multicasts a request
(using the Internet Cache Protocol (ICP)) to a set of other cache servers
(see "Squid Internet Object Cache", by D. Wessels et al.,
http://squid.nlanr.net/). If their caches contain the requested object,
these cooperating cache servers reply with a message indicating such. The
requested object is then obtained from the cooperating cache server which
responded first to the request, instead of from the original web server on
the Internet. However, if none replies after a time-out period, then the
requested object will be fetched from the originating web server. Load
imbalances can occur at a cache server due to forwarded requests.

Instead of multicasting, the CRISP system uses a logical central
directory to locate an object cached on another proxy server (see
"Directory Structures for Scaleable Internet Caches", S. Gadde et al.,
Technical Report CS-1997-18, Dept. of Computer Science, Duke University,
1997). Here, upon a cache miss, a cache server asks the directory server
for the object. With central knowledge of the caches' object storage, the
directory server sends such a request to the server whose cache includes
the object. If found, the object is then sent to the requesting server
while the original server continues to cache the object. If no cache has a
copy of the requested object, the requesting server obtains the object
from the originating web server through the Internet (potentially through
a WAN). Again, this can create a load imbalance at the cache server due to
subsequent requests forwarded to this cache server.

Yet another way to locate an object on a cooperating cache server is
through a hash function. An example is the Cache Array Routing Protocol
(CARP) (see V. Valloppillil and K. W. Ross, "Cache Array Routing Protocol
v1.0," Internet Draft,
http://ircache.nlanr.net/Cache/ICP/draft-vinod-carp-v1-03.txt, Feb. 1998).
In CARP, the entire object space is partitioned among the cooperating
cache servers, with one partition for each cache server. When a request
is received by a cache server from a configured client browser, a hash
function is applied to a key from the request, such as the URL or the
destination IP address, to identify the partition. If the hash partition
is assigned to the requesting cache server, then the request is serviced
locally. Otherwise, it is forwarded to the proper cache server in the
identified partition.
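
The partitioned lookup described above can be illustrated with a minimal sketch. It is not the hash defined by the CARP draft; the server names, the use of SHA-1, and the simple modulo mapping are all assumptions chosen only to show the "service locally or forward to the partition owner" decision.

```python
import hashlib

# Hypothetical set of cooperating cache servers, one partition each.
SERVERS = ["A", "B", "C"]

def partition_owner(key: str, servers=SERVERS) -> str:
    """Map a request key (e.g., a URL) to the server owning its partition."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

def route(key: str, receiving_server: str) -> str:
    """Return 'local' if the receiving server owns the partition,
    otherwise the name of the server the request is forwarded to."""
    owner = partition_owner(key)
    return "local" if owner == receiving_server else owner
```

Because every server applies the same function to the same key, all of them agree on the owner without exchanging messages, which is also why a hot partition concentrates forwarded requests on one server.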

SQUID, CRISP and CARP use the caches of other proxy servers to
reduce the possibility of having to go through the WAN for a missed
object. They differ in the mechanism for locating a cooperating cache
server whose cache may contain a copy of the requested object. Each cache
server services two kinds of requests: direct requests and forwarded
requests. Direct requests are those made directly from the browsers
connected to the proxy server. Forwarded requests are those made by
cooperating cache servers whose caches do not have the requested objects.
In any event, depending on the types of objects a proxy server caches at a
given moment, its CPU could be overloaded because it is busy serving both
direct and forwarded requests.

Summary of the Invention

In accordance with the aforementioned needs, one aspect of the
present invention provides a cache server load balancing method,
comprising the steps of: receiving forwarded requests from a cooperating
cache server in response to a cache miss for an object on the cooperating
cache server; and shifting one or more of said forwarded requests for the
object between cooperating cache servers based on a load condition and a
forwarding frequency for the object.

Another aspect of the present invention provides a method of load
balancing in a collection of cooperating cache servers, where each cache
server can receive direct requests and forwarded requests, and upon a
cache miss, a request can be forwarded to an owning cache server caching
said object, the method comprising the steps of: monitoring a load
condition and a forwarding frequency for said cooperating cache servers;
and shifting one or more forwarded requests from one cooperating cache
server to a second cooperating cache server based on a change in the load
condition and the forwarding frequency.

For example, in a system including a collection of cooperating proxy
cache servers, a request can be forwarded to another cooperating server if
the requested object cannot be found locally. Instead of fetching the
object from the originating web server through the Internet, a cache
server can obtain a copy from a cooperating cache server in a local area
network or an intranet. The average response time for access to an object
can be significantly improved by the cooperating cache server. However,
due to reference skew, some objects can be in high demand by all the
clients. As a result, the proxy cache servers that contain those hot
objects can become overloaded by forwarded requests coming from other
proxy cache servers, creating a performance bottleneck. According to the
present invention, we propose a load balancing method for a collection of
cooperating proxy cache servers by shifting some or all of the forwarded
requests from an overloaded cache server to a less loaded one.

An example of a cache server load balancing method in accordance
with the present invention includes the steps of: receiving forwarded
requests from a cooperating cache server in response to a cache miss for
an object on the cooperating cache server; and shifting one or more of the
forwarded requests for the object between cooperating cache servers based
on a load condition and a forwarding frequency for the object.

The present invention also includes features for periodically
monitoring the load condition on and the forwarding frequency to the
owning cache server; and proactively shifting one or more subsequent
forwarded requests for the cached object from the owning cache server to
one or more of the cooperating cache servers, in response to the
monitoring. Alternatively, the shifting step further includes the step of
checking the load condition and forwarding frequency, in response to the
receipt of a forwarded request. In one example, the load condition of the
cooperating cache server is a weighted sum of a count of said forwarded
requests, and a count of direct requests to said cooperating cache server.
In another example, the caching information is maintained at either the
individual object level or the level of a partition of objects.

The present embodiment includes various implementations for
performing the load balancing, including both centralized and distributed
environments and various hybrids thereof. For example, a distributed load
monitor can be used for monitoring and maintaining a local load condition,
the forwarding frequency and ownership information for cached objects on
each cooperating cache server. The cooperating cache servers can
periodically exchange and maintain one or more of: the load condition
information; the forwarding frequency; and the ownership information. For
example, the cooperating cache servers can exchange information by
piggybacking one or more of: the load condition information; the
forwarding frequency; and the ownership information, with one or more of
the forwarded requests and responses.

In another example, an overloaded cooperating cache server can
identify a less loaded cooperating cache server; and communicate a shift
request and a copy of the cached object to the less loaded cooperating
cache server (which then caches the object), so that subsequent requests
for the object will not be forwarded. Alternatively, an overloaded
cooperating cache server can communicate the shift request to the less
loaded cooperating cache server, which then obtains a copy of the object
from an originating object server, in response to the shift request. In
yet another alternative, the owning cache server can multicast the shift
request message to one or more of the other cooperating cache servers so
that subsequent forward requests will be shifted.

In a fully distributed implementation of the present invention, the
cooperating cache servers can each include a distributed load monitor for
monitoring and locally maintaining load conditions, and also can maintain
the forwarding frequency and ownership information in a local copy of a
caching table or by means of a hashing function. The cooperating cache
servers can modify the ownership information by means of the local copy of
the caching table or the hash function.

The present embodiment includes still other features for modifying
the ownership for the object to a shared ownership between at least two of
the cooperating cache servers and forwarding subsequent requests to
one or more less loaded shared owners of the object. If a decrease in the
load condition for a shared object is detected, the shared ownership can
be merged, in response to the decrease in the load condition.

In yet another example, the shifting of one or more of the forwarded
requests based on the load condition and the forwarding frequency can be
[...] object from the owning cache
[...] requests will not be forwarded as long as the object remains in the
recipient's cache).

An example of a centralized environment in accordance with the
present embodiment includes: a centralized logical load monitor for
[...] cooperating cache servers. The [...] monitoring the load on
[...] object locations.

Brief Description of the Drawings

[...] description, appended claims, and accompanying drawings, wherein:

Figure 1a shows an example of a system in a block diagram form
employing a collection of proxy cache servers, wherein a centralized load
balancing logic according to the present invention can be applied;

Figure 1b shows another example of a system in a block diagram form
employing a collection of proxy cache servers, where a distributed load
balancing logic according to the present invention can be applied;

Figures 2a-b show examples of data formats for two tables that can
be maintained by the load monitor depicted in Figures 1a-b;

Figure 3 shows an example of a logic flow for the load monitor in
response to a request from a cache server because of a cache miss; and

Figure 4 shows an example of a logic flow for a cache server in
response to a request for an object.

Detailed Description 

Examples of the load balancing logic of the present embodiment will 
be described for both centralized and distributed architectures. Figure 1a
shows an example of a block diagram of a system employing a collection of 
proxy cache servers, where a centralized load balancing logic proposed in 
this invention can be applied. As depicted, the system includes a 
collection of proxy cache servers 150. Although only a single level of 
cache server is depicted, there could be a hierarchy of cache servers 150. 
As is conventional, these proxy cache servers are connected with each 
other through a local area network (LAN) or a regional area network or 
intranet 140. Each cache server 150 is also connected to a wide area 
network (WAN) or the Internet 110. Through the WAN, these proxy cache 
servers can reach 115 the originating web servers for objects that cannot 
be found locally on their own caches. 

According to the present embodiment, a logical load monitor 120
includes a load balancing logic 130 for monitoring the load conditions and
forwarding frequency (Fig. 2a) of the cooperating cache servers 150 and
provides load balancing for them. As will be described below, various load
monitor 120 features can: reside in one or more of the cache servers; be
duplicated and distributed among the cache servers; or reside in another
dedicated system such as a personal computer (PC) server or workstation.
In a centralized system configuration, the load monitor 120 can perform a
central directory function in directing forwarded requests 125 to the
cache servers. One or more browsers 160 can be configured to connect to
each cache server 150. Direct requests 155 are sent from the clients, such
as computers running conventional browsers 160, to the configured cache
server 150. If the requested object can be found locally, then it is
returned to the browser. Otherwise, the cache server 150 communicates a
message to the load monitor 120. Various example implementations of the
load monitor 120 will be described in more detail below. If no load
imbalance condition or trend exists, the load monitor 120 then forwards the
request 125 to the cache server 150 that owns the requested object. The
owning cache server then sends the requested object to the requesting
cache server, e.g., via the LAN 140.

If an actual load imbalance is identified, or predicted based on a
loading trend, the load monitor 120 initiates a shifting of forwarded
requests from the overloaded cache server to one or more underloaded (or
less loaded) servers. As will be described in more detail below, the
shifting of ownership can be based on the load condition of the servers
150 and the forwarding frequency, as well as other factors.

Figures 2a-b show examples of data formats of two tables maintained
by the load monitor. As depicted, the tables include a load table 102 and
a caching table 101. One skilled in the art will appreciate that a single
table or various other data structures could alternatively or
equivalently be used. The load table 102 includes the load condition 1021
[...] a cache server can be a weighted sum of the number of forwarded
requests and the number of direct requests. An overloaded cache server 150
can be identified by any conventional techniques, e.g., the load monitor
can compute the mean load of all proxy cache servers in past [...].
Overloaded cache servers can be those with loads exceeding a threshold
[...]. According to the present [...]
takes into account the amount of overloading as well as the load due to
the forwarding frequency 1011 of the cached objects. This way, the load
monitor can decide whether or not to continue shifting [...]
forwarded requests from an overloaded cache server C 10213 to an
underloaded server A 10211. The caching table 101 includes the
forwarding frequency 1011 and ownership 1012 information of an object or a
partition of objects. As will be discussed below, the ownership can be
[...] cooperating
cache servers. The forwarding frequency 1011 represents the number of
[...] has been forwarded through the load monitor.
[...] also maintain a timestamp 1013, indicating the most recent time a
request [...]
a hash function on object identifiers, or can be based on the directory
information such as the time stamp information.
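
The two tables and the mean-based overload check can be sketched as follows. This is only an illustration under stated assumptions: the field names, the sample loads, and the 1.5 multiplier applied to the mean are hypothetical, since the text says only that overloaded servers are those whose loads exceed a threshold.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class CachingEntry:
    """One row of the caching table 101 (per object id or partition id 1010)."""
    forwarding_frequency: int = 0                 # 1011
    owners: list = field(default_factory=list)    # 1012: sole or shared owners
    timestamp: float = 0.0                        # 1013: most recent request time

# Load table 102: server -> load condition 1021 (sample values are made up).
load_table = {"A": 10.0, "B": 50.0, "C": 90.0}

# Caching table 101: object p is initially owned solely by server B.
caching_table = {"p": CachingEntry(owners=["B"])}

def overloaded_servers(loads, factor=1.5):
    """Servers whose load exceeds `factor` times the mean load.
    The factor is an assumed threshold rule, not taken from the text."""
    avg = mean(loads.values())
    return [server for server, load in loads.items() if load > factor * avg]
```

With the sample loads the mean is 50, so only server C exceeds the assumed threshold of 75.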

Figure 3 shows an example of a logic flow for steps taken by the
[...]
server is assigned as its owner. After the entry is located in the caching
table, in step 203, the forwarding frequency 1011 is updated, e.g.,
incremented by 1. The load monitor then examines the load table 102 to
see if the owner is currently overloaded (and that the forwarding
frequency 1011 is a significant contributor thereto), in step 204. If yes,
in step 205, the load monitor finds an underloaded (or less loaded) cache
server and assigns it as the new sole 10122 (or shared 10123) owner of the
requested object. The ownership information 1012 for the object in the
caching table 101 is updated accordingly. Those skilled in the art will
appreciate that the logic flow could comprise a shared 10123 or
hierarchical ownership 1012 in the caching table 101 or other data
structure employed. The request (possibly with a copy of the requested
object) can then be forwarded 125 to a new sole 10122 (or shared 10123)
owner, in step 206. Alternatively, the new owner can be requested to
obtain 115 an object copy from the originating object server, e.g., via
the Internet 110. Those skilled in the art will appreciate that the load
checking step 204 can be performed proactively, i.e., periodically or in
response to an identified overload or overload trend 1021 (due at least
in part to a high forwarding frequency 1011) for a given object
id/partition id 1010 and cache server (ownership 1012). If so, then in
step 205, the load monitor finds an underloaded (or less loaded) cache
server, assigns it as the new (or shared) owner of the requested object,
and possibly sends a copy of the object to the new (or shared) owner as
above. Conversely, if a shared ownership model is used, in step 208, when
the load condition 10211 and forwarding frequency 10111 for a shared
ownership object (p 10101) drop below a predetermined threshold, in step
209, the shared ownership (B, A 10121) can be merged to a single ownership
and one of the copies purged from one of the cache servers A 10121, e.g.,
to make room for another hot object.
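
Steps 203 through 206 can be sketched roughly as below. The function name, the dictionary layout, both thresholds, and the handling of an object with no table entry (the requester becomes its owner) are assumptions made for illustration; the text does not fix any of them.

```python
def handle_forwarded_miss(obj_id, requester, caching_table, load_table,
                          load_threshold=75.0, freq_threshold=10, shared=True):
    """Return the server to which the load monitor forwards the request."""
    entry = caching_table.get(obj_id)
    if entry is None:
        # Assumption: no entry exists, so the requester is recorded as owner.
        caching_table[obj_id] = {"freq": 0, "owners": [requester]}
        return requester

    entry["freq"] += 1                                   # step 203
    # With shared ownership, consider the least loaded current owner.
    owner = min(entry["owners"], key=load_table.get)

    overloaded = load_table[owner] > load_threshold      # step 204
    hot = entry["freq"] >= freq_threshold                # significant contributor
    if overloaded and hot:
        candidates = [s for s in load_table if s not in entry["owners"]]
        if candidates:
            new_owner = min(candidates, key=load_table.get)   # step 205
            if load_table[new_owner] < load_table[owner]:
                if shared:
                    entry["owners"] = entry["owners"] + [new_owner]
                else:
                    entry["owners"] = [new_owner]
                owner = new_owner
    return owner                                         # step 206
```

Requiring both conditions reflects the point that ownership should move only when forwarded requests for this object are a meaningful part of the owner's load; shifting a rarely forwarded object would cost a copy without relieving the overload.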

Figure 4 shows an example of a logic flow for a cache server when a
request for an object is received, either directly 155 from a browser 160
or forwarded 125 from the load monitor 120. As depicted, in step 301, it
first checks to see if the requested object can be found locally in its
cache. If yes, in step 302, it returns the object and the process ends, in
step 306. Otherwise, in step 303, it checks to see if the request is a
direct request or a forwarded request. If it is a direct request, in step
304, the request is sent to the load monitor and the process ends, in step
306. On the other hand, if the request is a forwarded request, in step
305, the cache server will fetch the object from the originating web
server and return the object. The process then ends, in step 306.
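
The cache server flow of Figure 4 can be sketched as a single function. The two callables standing in for the load monitor and the originating web server are hypothetical, and storing the fetched object in the local cache in step 305 is an assumption (the text says only that the object is fetched and returned).

```python
def handle_request(cache, obj_id, is_forwarded, ask_load_monitor, fetch_from_origin):
    """Serve a direct or forwarded request at one cache server."""
    if obj_id in cache:                      # step 301: local lookup
        return cache[obj_id]                 # step 302: local hit
    if not is_forwarded:                     # step 303: direct request?
        return ask_load_monitor(obj_id)      # step 304: hand off to load monitor
    obj = fetch_from_origin(obj_id)          # step 305: forwarded miss
    cache[obj_id] = obj                      # assumption: keep the fetched copy
    return obj
```

A forwarded request that misses locally goes straight to the origin rather than back to the load monitor, which avoids forwarding loops between cache servers.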

Referring now to Figures 1a and 2a-b, assume, for example, that a browser
160 connecting to a cache server C 10223 requests 155 an object p 10101.
From the caching table 101, it can be seen that object p 10101 is not
cached on server C, but it is cached on ("owned" by) cache server B
(assuming B, A 10121 is initially solely designated by B). In
response to a cache miss on object p, server C 10223 sends a request to
the load monitor 120 for object p. Depending on the load condition 10212
and forwarding frequency 1011 of requests for p 10101 on server B, the
load monitor may forward the request to server B, asking it to send a copy
of object p to server C. Or, if server B is currently overloaded or is
trending as such, the load monitor might shift the forwarded request by
[...] object [...]
even after the transfer of ownership, a copy of object p is still on
server B's cache and can still serve direct requests coming to
server B. [...] forwarded requests for object p (or
perhaps some, in the case of shared ownership) will be shifted to server
[...]. Alternatively, in the case of shared ownership B, A 10121, future
forwarded requests for object p 10101 can be sent to the less loaded
server.
Now that a load balancing method according to the present invention
has been described for a collection of proxy cache servers where a
central directory is used for locating an object, various alternatives
[...]. The present invention can be adapted to achieve load
balancing for these systems as well.

For example, the present invention can be configured to perform load
balancing for a collection of cooperating proxy cache servers where each
[...]
this information with each cache server 150. The load balancing can be
achieved by excluding overloaded servers from the list of
servers to which a cache server multicasts its request (also called a
shift request). As a result, only less loaded cache servers will receive
forwarded requests 125.
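
A minimal sketch of the exclusion step, assuming each cache server holds a view of its peers' load conditions and a fixed load threshold (both the threshold and the function name are hypothetical):

```python
def multicast_targets(sender, load_table, load_threshold):
    """Peers that should receive the sender's multicast request:
    every other server whose load does not exceed the threshold."""
    return [server for server, load in load_table.items()
            if server != sender and load <= load_threshold]
```

For example, with loads `{"A": 10, "B": 90, "C": 40}` and a threshold of 75, server C would multicast only to A, so the overloaded server B receives no new forwarded requests.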

Another alternative is a load balancing method for a collection of
cooperating proxy cache servers where a hash function is used to locate a
copy of a locally missed object. In this case, the object space can be
partitioned among the cooperating proxy cache servers 150, with one
partition for each cache server. In order to achieve load balancing by
shifting forwarded requests, one can change the hash function so that
forwarded requests will not go to overloaded servers. One preferred
approach is to hash the object space into a large number of buckets, much
larger than the total number of proxy cache servers. These hash buckets
are then assigned to the cache servers, with the goal of balancing the
loads among them. Periodically, one can move one or more hash buckets from
one overloaded server to an underloaded server, effectively changing the
hash function.
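
The bucket approach can be sketched as follows. The bucket count of 64, the CRC32 hash, the round-robin initial assignment, and the policy of moving the lowest-numbered buckets first are all assumptions for illustration.

```python
import zlib

NUM_BUCKETS = 64  # assumed; chosen to be much larger than the number of servers

def bucket_of(key: str) -> int:
    """Hash an object key (e.g., a URL) into a fixed bucket."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS

def initial_assignment(servers):
    """Spread the buckets round-robin over the servers."""
    return {b: servers[b % len(servers)] for b in range(NUM_BUCKETS)}

def owner_of(key, assignment):
    """Server that currently owns the bucket the key falls into."""
    return assignment[bucket_of(key)]

def move_buckets(assignment, src, dst, count=1):
    """Reassign up to `count` buckets from src to dst; return how many moved."""
    moved = 0
    for bucket, owner in sorted(assignment.items()):
        if moved == count:
            break
        if owner == src:
            assignment[bucket] = dst
            moved += 1
    return moved
```

Only the bucket-to-server assignment changes; the key-to-bucket hash stays fixed, so only keys in the moved buckets are redirected while every other object keeps its owner.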

In either case, the load condition of the cooperating cache server
can factor the forwarding frequency directly into the calculated load
condition. For example, the load condition can be a weighted sum of a
count of said forwarded requests, and a count of direct requests to said
cooperating cache server. Alternatively, the load monitor could separately
maintain the overall forwarding frequency for each cooperating cache
server.
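
The weighted sum can be written out directly. The weight values are not given in the text; the defaults of 1.0 below are placeholders.

```python
def load_condition(forwarded_count, direct_count, w_forwarded=1.0, w_direct=1.0):
    """Load condition as a weighted sum of forwarded and direct request counts."""
    return w_forwarded * forwarded_count + w_direct * direct_count
```

Giving forwarded requests a larger weight than direct requests would make the load condition more sensitive to the traffic that shifting can actually move.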

Referring now to Figures 1b and 2a-b, yet another alternative is a
load monitor 120 that is distributed, i.e., wherein some or all of the load
monitor is duplicated across the cache servers 150. In one example, the
distributed load monitor includes local load condition information 1021
(and as described below, possibly the load conditions of all (A, B, C, ...
1022)) of the cooperating cache servers 150. The distributed load monitor
120' preferably also includes the caching table 101 with the forwarding
frequency 1011 and ownership 1012 information for each object id/partition
id 1010. Alternatively, a hashing function, for example as described
above, could be distributed and stored in the cache servers. Load
condition information 1021 and/or caching information 101 can be
exchanged periodically; when there is a change in status (ownership or
significant change in load condition); or piggybacked with cache
forwarding requests and responses. Load condition 1021 information could
also have a time stamp (not shown) associated with it for tracking or
other purposes.
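
One way to picture the piggybacking is a forwarded request that carries the sender's load condition and its time stamp, which the receiver merges into its local view. The message fields and the "newer time stamp wins" rule are assumptions; the text states only that load information can be piggybacked and can carry a time stamp.

```python
from dataclasses import dataclass

@dataclass
class ForwardedRequest:
    obj_id: str
    sender: str
    sender_load: float       # piggybacked load condition of the sender
    load_timestamp: float    # when that load condition was measured

def absorb_piggyback(local_view, msg):
    """Update the local view {server: (load, timestamp)} from a message,
    keeping only the most recent report per server."""
    known = local_view.get(msg.sender)
    if known is None or known[1] < msg.load_timestamp:
        local_view[msg.sender] = (msg.sender_load, msg.load_timestamp)
```

Comparing time stamps keeps a delayed message from overwriting a fresher load report that arrived by another path.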

Here, if a cache server 150 has a cache miss, the local load monitor
120' looks up the ownership of the requested object in its local caching
table 101 and forwards the request to the owning cache server.
Alternatively, the hash function could be applied to a key from the
request, such as the URL or the destination IP address, to identify the
partition and the request then forwarded to the correct cache server. When
the forwarded request (i.e., from a cache server that had a cache miss) is
received, the owning cache server identifies it as a forwarded request
(e.g., by identifying it as from another cache server as opposed to a
client) and updates its forwarding frequency 1011 information, as
applicable (Fig. 3, step 203). If an overload trend or condition is
indicated (step 204), the owning cache server can respond to the
requesting cache server with a shift request and a copy of the cached
object. Alternatively, the requesting cache server can obtain a copy from
the originating object server via an intranet, WAN, or Internet 110. In
either case, when the forwarding server caches a copy of the object, this
server will no longer issue forward requests (steps 301, 302) as long as
it remains in the cache, thus proportionally reducing the load on the
owning server. In addition, the owning cache server can multicast a shift
request message to one or more of the other cooperating cache servers 150
so that subsequent forward requests will be shifted, e.g., by updating
their local copy of the caching table or modifying the hash function (step
205). At this point, other cache servers can forward their requests to the
new owner (or to the least loaded owner of two or more cache servers 150
if ownership is shared) as indicated in their local copy of the caching
table 101. When the original cache owner's load has decreased to an
acceptable level (step 204), e.g., as indicated by a threshold, the shared
ownership information can be merged to its original state (e.g., B, A 10121
-> B).

in the case that the load condition information 1021 for all cache 
servers ( A,B,C ... 1022) is fully distributed, the requesting cache 
server could proactively check the load condition (and associated time 
stamp) of the owning server (step 204), i.e., before forwarding the 
request. If overloaded, the requesting server could request a copy of the 
object from the owning server (or from the originating server via the 
intranet or internet 110) and possibly a load condition confirmation. The 
owning cache server could update its caching table 101 or modify the hash 
function to indicate the new shared ownership (step 205) . The ~^ing 
server (or the owning server) could then multicast a message to all other 
cache servers 150 indicating the new shared ownership of the object and 
possibly include an updated load condition. At this point, other cache 
servers would update their caching tables 101 or modify the hash function 
to indicate the new shared ownership (step 202), and can forward their
requests (step 206) to the least loaded owner of two or more cache servers
150 sharing ownership as indicated in their local copy of the caching
table 101. When a shared cache owner's load has decreased to an acceptable
level (steps 204 and 208), e.g., as indicated by a threshold, the
ownership information can be merged to its original state, in step 209.
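
By way of illustration only, the proactive check made by a requesting cache server before forwarding might be sketched as follows. The function name should_forward, the overload threshold and the maximum age of a load report are hypothetical values chosen for the example. Treating a missing or stale report as "not known to be overloaded" is likewise an assumption; the description states only that a time stamp is associated with the load condition.

```python
import time

OVERLOAD_THRESHOLD = 0.8  # assumed load level regarded as overloaded
MAX_AGE_SECONDS = 30.0    # assumed age beyond which a load report is stale


def should_forward(load_info, owner, now=None):
    """Return True if a missed request should be forwarded to `owner`.

    load_info maps server name -> (load, timestamp). A missing or stale
    entry is treated as "not known to be overloaded", so the request is
    forwarded as usual.
    """
    now = time.time() if now is None else now
    entry = load_info.get(owner)
    if entry is None:
        return True
    load, stamp = entry
    if now - stamp > MAX_AGE_SECONDS:
        return True
    return load < OVERLOAD_THRESHOLD
```

When the function returns False, the requesting server would, under the scheme described above, request a copy of the object instead and become a shared owner.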

A preferred embodiment of the present invention includes features 
that can be implemented as software tangibly embodied on a computer 
program product or program storage device for execution on a processor 
(not shown) provided with cache server 150 or other computer embodying the 
load monitor 120, such as in the centralized model described. Software
implemented in popular object-oriented computer executable code
such as JAVA provides portability across different platforms. Those
skilled in the art will appreciate that many other compiled or
interpreted, procedure-oriented and/or object-oriented (OO) programming
environments, including but not limited to REXX, C, C++ and Smalltalk, can
also be employed.

Those skilled in the art will also appreciate that the software for the
methods of the present embodiment may be embodied on a magnetic,
electrical, optical, or other persistent program and/or data storage
device, including but not limited to: magnetic disks, Direct Access
Storage Devices (DASD), bubble memory; tape; optical disk formats such as
CD-ROMs and DVD; and other persistent (also called nonvolatile) storage
devices such as core, ROM, PROM, flash memory, or battery backed RAM.
Those skilled in the art will appreciate that within the spirit and scope
of the present invention, one or more of the components instantiated in
the memory of the server 120' could be accessed and maintained directly
via disk (not shown), the network, another server, or could be distributed
across a plurality of servers.

In summary, in a system including a collection of cooperating cache
servers, such as proxy cache servers, a request can be forwarded to a
cooperating cache server if the requested object cannot be found locally.
An overload condition is detected if, for example, due to reference skew,
some objects are in high demand by all the clients and the cache servers
that contain those hot objects become overloaded due to forwarded
requests. In response, the load is balanced by shifting some or all of the
forwarded requests from an overloaded cache server to a less loaded one. 
Both centralized and distributed load balancing environments are 
described. 

While we have described our preferred embodiments of our invention 
with alternatives, it will be understood that those skilled in the art, 
both now and in the future, may make various improvements and enhancements 
which fall within the scope of the claims which follow. These claims
should be construed to maintain the proper protection for the invention
first disclosed.

CLAIMS

1. A cache server load balancing method, comprising the steps of: 

receiving forwarded requests from a cooperating cache server in 
response to a cache miss for an object on the cooperating cache server; 
and 

shifting one or more of said forwarded requests for the object 
between cooperating cache servers based on a load condition and a 
forwarding frequency for the object. 

2. The method of claim 1, said shifting step further comprising the
steps of:

periodically monitoring the load condition on and the forwarding 
frequency to an owning cache server; and 

proactively shifting one or more subsequent forwarded requests for 
the cached object from the owning cache server to one or more of said 
cooperating cache servers, in response to said monitoring. 

3. The method of claim 1 or 2, said shifting step further comprising
the step of checking the load condition and forwarding frequency, in 
response to the forwarded request. 

4. The method of claim 1, 2 or 3, wherein said shifting comprises the
step of modifying an ownership for the object to a shared ownership 
between two or more of said cooperating cache servers. 

5. The method of claim 4, further comprising the step of merging said
shared ownership in response to a change in the load condition.

6. The method of any of claims 1 to 5, further comprising the step of
locally monitoring the load on each cooperating cache server. 

7. The method of claim 6, further comprising the step of: 

a distributed load monitor monitoring and maintaining a local load 
condition, the forwarding frequency and ownership information for cached 
objects on said each cooperating cache server. 

8. The method of claim 7, further comprising the steps of:

said cooperating cache servers periodically exchanging and
maintaining one or more of: the load condition information; the forwarding
frequency; and the ownership information.

9. The method of claim 7, further comprising the steps of:

said cooperating cache servers exchanging by piggybacking one or
more of: the load condition information; the forwarding frequency; and the
ownership information; with one or more of the forwarded requests and
responses.

10. The method of any of claims 1 to 9, further comprising the step of: 
receiving a forwarded request and updating the forwarding frequency. 

11. The method of claim 7, 8, 9 or 10, further comprising the steps of:

identifying a less loaded cooperating cache server; and

communicating one or more of: a shift request; and a copy of the
cached object, to said less loaded cooperating cache server.

12. The method of claim 11, further comprising the steps of: 

said less loaded cooperating cache server receiving said shift 
request; and

said less loaded cooperating cache server requesting a copy of the 
object from an originating object server, in response to said shift 
request. 

13. The method of claim 11, wherein the copy is obtained via one or more
of an intranet, WAN or internet.

14. The method of any of claims 1 to 13, further comprising the step of
multicasting a shift request message to one or more of the other
cooperating cache servers so that subsequent forward requests will be
shifted.

15. The method of claim 14, further comprising the step of: 

the cooperating cache servers maintaining one of: a local copy of a
caching table and modifying a hash function; and

the cooperating cache servers modifying the ownership information by
one of: updating a local copy of a caching table; and modifying a hash
function.

16. The method of claim 15, further comprising the steps of: 

modifying the ownership for the object to a shared ownership between 
at least two of said cooperating cache servers; and 

said cooperating cache servers forwarding subsequent object requests 
to one or more less loaded shared owners of the object. 

17. The method of claim 16, further comprising the steps of: 

detecting a decrease in the load condition for a shared object; and 

merging the shared ownership, in response to the decrease in the 
load condition. 

18. The method of any of claims 1 to 17, wherein said shifting one or
more of said forwarded requests comprises the steps of: 

communicating a copy of the object from an owning cache server to 
one or more of said cooperating cache servers; and 

said cooperating cache server receiving and caching the copy of the
object.

19. The method of any of claims 1 to 18, further comprising the steps
of:

calculating the load condition of each cache server in past
intervals;

computing a mean load of all cache servers in past intervals; and

finding the cache servers that exceed a threshold above said mean
load.
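
By way of illustration only, the step of finding the cache servers that exceed a threshold above the mean load might be sketched as follows. The function name overloaded_servers and the sample load figures are hypothetical. The threshold is assumed here to be an absolute margin added to the mean; a multiplicative factor would be an equally plausible reading.

```python
from statistics import mean


def overloaded_servers(loads, threshold):
    """Return the servers whose load exceeds the mean load by more than `threshold`.

    loads maps server name -> load measured over the past interval(s).
    """
    if not loads:
        return []
    average = mean(loads.values())
    return sorted(s for s, load in loads.items() if load > average + threshold)


# Mean load is 24, so only a load above 44 counts as overloaded here.
hot = overloaded_servers({"A": 10, "B": 50, "C": 12}, threshold=20)
```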

20. The method of any of claims 1 to 19, wherein the load condition of
said cooperating cache server can be a weighted sum of a count of said
forwarded requests, and a count of direct requests to said cooperating
cache server.
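
By way of illustration only, a load condition formed as a weighted sum of the two request counts might be computed as follows. The function name load_condition and the default weights of 1.0 are hypothetical; no particular weights are specified herein.

```python
def load_condition(forwarded_count, direct_count, w_forwarded=1.0, w_direct=1.0):
    """Weighted sum of forwarded-request and direct-request counts for one server."""
    return w_forwarded * forwarded_count + w_direct * direct_count


# Example: forwarded requests weighted more heavily than direct requests.
example_load = load_condition(30, 100, w_forwarded=2.0, w_direct=0.5)
```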

21. The method of any of claims 1 to 20, further comprising the step of
maintaining cache information at one or more of: each object level; and a
partition of objects level.

22. The method of claim 21, wherein said cache information of said 
object level or said partition comprises the forwarding frequency
associated with the object. 

23. The method of claim 22, further comprising the step of: 

a distributed load monitor monitoring and locally maintaining load 
conditions, forwarding frequency and ownership information for cached 
objects on each cache server.

24. The method of claim 23, further comprising the steps of:

said cooperating cache servers periodically exchanging one or more 
of the load condition, the forwarding frequency and the ownership 
information. 

25. The method of claim 22, 23 or 24, further comprising the steps of:

said cooperating cache servers exchanging by piggybacking one or 
more of: the load condition; the forwarding frequency; and the ownership 
information; with one or more of the forwarded requests and responses. 

26. A method of load balancing in a collection of cooperating cache
servers, where each cache server can receive direct requests and forwarded 
requests, and upon a cache miss, a request can be forwarded to an owning 
cache server caching said object, the method comprising the steps of: 

monitoring a load condition and a forwarding frequency for said 
cooperating cache servers; and 

shifting one or more forwarded requests from one cooperating cache 
server to a second cooperating cache server based on a change in the load 
condition and the forwarding frequency. 

27. The method of claim 26, wherein said step of monitoring the load 
condition comprises the steps of: 

calculating the load condition of each cache server in past
intervals;

computing a mean load of all proxy cache servers in past intervals; and

finding those proxy cache servers that exceed a threshold above said
mean load.

28. The method of claim 26 or 27, wherein said shifting step can be
[...] said forwarded requests from said [...]
and the forwarding frequency.

29. The method of claim 26 or 27, further comprising the step of a
centralized load monitor maintaining the forwarding frequency and
the load condition for the cooperating cache servers.

30. The method of claim 26, 27, 28 or 29, wherein the load condition of
said cache server can be a weighted sum of: a count of forwarded requests;
and a count of direct requests to said cache server.

31. The method of claim 26, 27, 28 or 29, further comprising the step of
maintaining cache information at each object level or a partition of
objects level.

32. The method of claim 31, wherein said cache information of the object
level or the partition level comprises the forwarding frequency of
requests through said load monitor to said object.

33. The method of any of claims 26 to 32, wherein said cooperating cache
servers comprise cooperating proxy cache servers.

34. The method of any of claims 26 to 32, further comprising the steps
of:

a logical directory server maintaining a caching table and a load
table;

said cache servers interrogating said directory server for object
locations in other cache servers for a locally missed object; and

said directory server load balancing requests among said
servers by manipulating said caching table, in response to requests for
object locations.

35. The method of claim 29, further comprising the steps of: 

each cache server [...]
servers to locate a copy of a locally missed object; and




said shifting step comprising the step of excluding overloaded cache 
servers from a subset of neighboring cache servers for multicasting. 

36. A program storage device readable by a machine, tangibly embodying a 
program of instructions executable by the machine to perform method steps
for cache server load balancing, said method steps comprising: 

receiving forwarded requests from a cooperating cache server in 
response to a cache miss for an object on the cooperating cache server; 
and

shifting one or more of said forwarded requests for the object 
between cooperating cache servers based on a load condition and a 
forwarding frequency for the object. 
